perm filename RFNDIR[154,RWF] blob
sn#856247 filedate 1988-04-21 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00007 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 EXTERNAL MEMORY
C00005 00003 Discs
C00011 00004 Coordinates
C00015 00005 \rm
C00022 00006 The problem is now in the classic form for a recursive function. We have a set
C00025 00007 To prepare ordinary English text for typography by TEX, begin by entering
C00039 ENDMK
C⊗;
EXTERNAL MEMORY
Tapes
The tapes at LOTS have a density of 600 bytes per inch, and a speed of 45
inches per second; this gives a maximum information transfer rate,
ignoring start/stop times and gaps, of 27000 bytes/sec. At 5 bytes/word,
this implies a time of 85 microsec/word; ordinarily, the inner loop of a
sorting algorithm will be much faster than this, so tape sorting
algorithms will be limited entirely by tape speeds, and sophisticated
algorithms will be economical. Information is stored in records of at
most 30,700 bytes per record, followed by a 0.6 inch gap; crossing the gap
takes 13333 microseconds, adding an average of 2 microseconds/word for
data stored in long records. Tapes are 2400 feet long; this gives a
capacity of 44.67 megabytes, or 8,934 million words.
Faster tape drives with higher density are available; at 6250 bytes/inch
and 125 inches/sec, the speed and capacity parameters are 781250 bytes/sec
= 156250 words/sec = 6.4 microsec/word. At this speed, even a simple
sorting algorithm probably can't keep up with the tape, and a fast inner
loop becomes very important.
Both types of tape can read either forward or backward, but only write
forward. After a write, all data to the right of the new information is
lost, so ordinarily one thinks of writing only at the right end of the
tape. (Some tapes are not so restricted, e.g. Dectapes.) Start/stop time
is ______________; it can be avoided if the program can keep up with the
tape (?)
Discs
A disc memory can b seen on many levels. As a mechanical device, it has a
certain set of parameters (speed, capacity, access times, constraints).
When combined with its electronic controller the parameters are changed,
never for the better. When incorporated into a computer, the parameters
may change again. The adoption of a file-handling system and operating
system changes them once more. We shall look at the LOTS discs at each
level.
At LOTS, the discs are of type 3330 (the same as at SCORE). There are 4
disc drives, each of 19 surfaces, 800 tracks per surface, 20 sectors per
track, 640 bytes (or 128 words) per sector. Rotation time is 1/60 sec =
16,667 microseconds. There is one arm per surface, which must be moved
firm track to track. All arms move together; at any one time, all arms
are on the same cylinder of tracks (?) Seek time for moving from one
track to another is a complicated function of track number; it is never
less than ______, or more than ______, and averages 30 ms.
At the hardware level, then, a track contains 20 x 640 = 12,800 bytes or
2560 words, which can be read, starting anywhere on the track, at a rate
of 60 x 12800 = 768,000 bytes/sec, or 60 x 2560 = 153,600 words/sec.; that
is, 1,302 microsec/byte, 6.510 microsec/word. This is fast enough to keep
up with most sorting algorithms.
One can avoid seek times by staying within one cylinder; a cylinder
contains 19 x 12800 = 243,200 bytes = 48,640 words. An entire disc
contains 800 x 243,200 bytes = 194.56 Mbytes = 38.912 Mword. (For
comparison, the Palo Alto White Pages contain about 160,000 entries
averaging about 50 characters, or 8 Mbytes, 1/25 as much.)
At the operating system level, a programmer normally is sharing the disc
drive with many other users; his disc requests find the discs and arms in
almst random positions, except that central tracks are most likely. His
requests for certain tracks will be queued and processed by an algorithm
which alternates going outward and inward much as an elevator alternates
handling all upward requests with all downward ones. At the operating
system, information is organized into pages of four consecutive sectors =
2560 bytes = 512 words; all information transfer is by full pages.
Successive pages of a file are usually, but not always, consecutive on the
disc. When a track is reached, reading always starts at the first word of
the tracks.
At the machine/assembly language level, disc pages are an addressable
virtual memory. Some pages are resident in fast memory, by whim of the
operating system. Nonresident pages may be in a disc file, or in
``swapping space'' (the most frequently used central tracks) on the disc.
A program can give the operating system advance warning of its need for
the next page in a file. Disc accesses are optimized for seek time but
not for rotational time.
At the Pascal level, a Pascal file is mapped into 32 consecutive pages of
the virtual address space, which is then treated as described above.
Another type of disc unit is the IBM 3350, with capacity of 317 Mbytes.
It has ________ cylinders, 30 tracks per cylinder, 19 K bytes per track.
A newer type is the 3380, with 2.5 G bytes of storage. It has 4
independently moving arms, each with its own set of cylinders; in effect,
it is equivalent to 4 disc drives, each of 625 Mbyte.
The cost of disc storage is about 10 Kbytes/dollar; at this cost, it is
economically feasible to permanently retain all programs, text, and ther
data produced by humans, buying more disc drives when old ones fill.
Coordinates
In drawing graphs of functions of one variable, it is customary to use the
horizontal dimension to represent the independent variable, usually called
x, increasing from left to right. The vertical dimension represents the
dependent variable, usually called y, increasing from bottom to top. We
will call such coordinate systems graph coordinates. In graph coordinates,
the points shown below have the coordinate pairs written next to them.
It is customary to identify array elements, however, by a different convention,
as shown by the example below:
A11 A12 A13 A14
A21 A22 A23 A24
A31 A32 A33 A34
Here the first coordinate, often called i, is vertical and increases from top
to bottom; the second, often called j, is horizontal, and increases from left
to right. We will call such coordinate systems array coordinates. Also, in
array coordinates, the coordinates usually refer to the squares of the grid,
rather than the corners. In an array coordinate system, the shaded squares
shown below have the coordinates shown next to them.
Comparing the two illustrations, notice that the shift between array and graph
coordinates roughly corresponds to a 90 degree rotation of the image.
In computing, it is customary to use array coordinates in referring to arrays
and to input and output pages. In this text, we usually use R (for row) as
the vertical coordinate, and C (for column) as the horizontal on the printed
page. Array coordinates correspond to the order in which most computer
printers print the characters on the page, with lines from top to bottom
and characters left to right within the line. A procedure call to put an
asterisk as character 9 of line 5 of a page will normally be P('*',9,5).
Your programs will usually be most intelligible to other readers if you
follow this convention. Your typical page printing operation will be:
FOR R:=1 TO 60 DO
BEGIN
FOR C:=1 TO 132 DO
WRITE(IMAGE[R,C]);
WRITELN
END
\rm
\font\sevenrm=cmr7
\footnote{}{\sevenrm\noindent integs.tex[1,rfn]}
\centerline{Adaptive Integration: a Study in Recursion}
\vskip .125in
\centerline{R.\ W.\ Floyd}
\centerline{$\copyright$ 1983}
Suppose I want to evaluate $\int↑1↓0e↑{-x↑2}dx$. I look in a table of integrals;
it isn't there. I ask a mathematician; he says there is no formula for
$\int e↑{-x↑2}dx$. (More precisely, no formula using the elementary functions;
the integral can be expressed in terms of the {\sl error function},
$\hbox{erf}(x)$, for
which there are tables. Let's pretend someone has checked out the library's
only copy.) I will have to approach the problem computationally. One method
is to approximate the function by narrow trapezoids or rectangles
\vskip 2in
\noindent
as shown in the figure (in practice, of course, we would use narrower ones).
That is, I can break up $\int↑b↓af(x)dx$ into $\int↑{x↓1}↓{x↓0}f(x)dx
+\int↑{x↓2}↓{x↓1}f(x)dx+\cdots+\int↑{x↓n}↓{x↓{n-1}}f(x)dx$, where
$x↓0=a$, $x↓n=b$, and $x↓0$, $x↓1$, $x↓2,\cdots ,x↓n$ are closely spaced.
Then I can approximate $\int↑{x↓{i+1}}↓{x↓{i}}f(x)dx$
by $0.5(f(x↓i)+f(x↓{i+1}))(x↓{i+1}-x↓i)$,
or by $f(x↓i)(x↓{i+1}-x↓i)$, or by $f(0.5(x↓i+x↓{i+1}))(x↓{i+1}-x↓i)$, or
something similar.
These methods of numerical integration are in regular use; there are some
analogous methods that tend to be more accurate than the ones above, but the
principle is the same.
We can't be entirely sure, though, that such a method gives an accurate answer.
If I try to evaluate $\int↑1↓0\cos (200πx)dx$ by evaluating $\cos (200πx)$ at
$x=0,0.01$, $0.02,\cdots ,0.99,$ $1.00$, and then using a formula like the ones
above, I find the integrand is 1.0 at each value of $x$ I try, and the estimate
I come up with will be $\int↑1↓0\cos (200πx)dx=1.0$; the true value, though,
is zero. By dividing the interval into smaller sections, I could get a more
accurate answer--but how can I be sure it's accurate enough?
\vfill\eject
I have an idea. If I can get one estimate that I know is too large, and another
that I know is too small, and if they differ by less than 0.01, then either one
(or their average) is an accurate enough answer. For an estimate I know is too
small, I use the trapezoid shown below:\par\vskip 3.5in\par
Since $e↑{-x↑2}$ has a negative second derivative in (0,1), it lies above the
straight line through the endpoints.
I'll call the area of the trapezoid
$T(0,1)$; $T(0,1)=0.5(f(0)+f(1))=0.5(1.0000+0.3679)=0.6839$.
To get an estimate that I know is too large, I find a trapezoid tangent to
$e↑{-x↑2}$ at the midpoint, as shown below:\par\vskip 3.5in
\vfill\eject
Because the second derivative is negative, the function stays below the straight
line. I'll call the area $M(0,1)$; it is equal to $f(1/2)=0.7788$.
Notice that it does not depend on the slope of the tangent.
Unfortunately, $0.7788-0.6839=0.0948$, which is much too big. But I can see that
if I subdivide the interval, and use the same technique on each of the smaller
intervals, I may get a more accurate result:\par\vskip 3.5in\par\noindent
Naturally, when I subdivide, I can't allow the error of each half to be as
much as 0.01, but if I require that each half be within 0.005 of the correct
value, the sum will be correct within 0.01.
What if one of the subdivisions still doesn't give me an accurate enough result?
I can cut it in half again, and again; in each region I can subdivide as much
as I need to get an accurate result. I will probably need to divide smaller
near $x=0$, where the second derivative is large, than near $x=1$, where it
is small.
The diagram below sums up the whole computation. Each box has a value of $A$,
$C$, and $L$; its job is to evaluate $\int↑C↓Af(x)dx$ with an error no bigger than
$L$. To do that, it calculates an
estimate\par\noindent\hbox{$T=0.5(f(A)+f(C))(C-A)$}, which is too
small, and (letting $B=0.5(A+C)$) an estimate $M=f(B)(C-A)$ which is too large.
If $T-M<L$, it calculates $Area$ $=0.5(T+M)$; otherwise, it breaks the interval
into two subintervals, estimates the area of each within a tolerance $L/2$, and
adds the results.
\par\vfill\eject
The problem is now in the classic form for a recursive function. We have a set
of parameters to define the particular problem; here, they are $A$, $C$, and $L$.
We have a direct way to solve the problem if it is ``small'' enough; here,
if $C-A$ is small enough $T-M$ will be smaller than $L$,
and we can use ${{T+M}\over 2}$
as the answer. Otherwise, we subdivide the problem into several smaller ones
of the same kind, each with a new set of parameters, and the function calls
itself to solve the smaller problems. In Pascal, a recursive function for this
problem would look like this:
\vskip .125in
{\obeylines
\hskip .25in {\tt FUNCTION AREA (A,C,L:REAL):REAL;
\hskip .25in VAR B,D,T,M:REAL;
\hskip .5in BEGIN
\hskip .5in B:=0.5$\ast$(A+C);
\hskip .5in D:=C-A;
\hskip .5in T:=0.5$\ast$(f(A)+f(C))$\ast$D;
\hskip .5in M:=f(B)$\ast$D;
\hskip .5in IF T-M$<$L THEN AREA:=0.5*(T+M)
\hskip .5in ELSE AREA:=AREA(A,B,L/2)+AREA(B,C,L/2)
\hskip .5in END}}
\vskip .125in
\par\noindent
and the main program could call it, for the particular problem we are solving,
with {\tt WRITE(AREA(0,1,0.01))}.
{\sl Exercise}: What would go wrong if we tried to get $\int↑2↓1e↑{-x↑2}dx$
by calling {\tt AREA(1,2, 0.01)}?
{\sl Exercise}: Modify the {\tt AREA} procedure so it also works if the second
derivative of $f$ is negative throughout $(A,C)$. How would you
use the modified function if you want to get $\int↑5↓0e↑{-x↑2}dx$?
\end
To prepare ordinary English text for typography by TEX, begin by entering
the text into a computer file, with a name of the form f.TEX, where f is
any name you choose. Use freely all printing characters except ${ }# \ ⊗
↑ ↓ % and quotation marks, all of which require special treatment.
Separate paragraphs with a blank line. Single spaces suffice after
periods, etc.; TEX will provide the correct spacing automatically. There
are sections of this introduction to deal with special problems that may
come up; where possible, these sections are self-contained. The sections
are:
Punctuation
Changing Fonts
Spacing
Lines and Boxes
Mathematical Formulas
Indentation
Titles and Headings
Tables
Page Breaks
Footnotes
Scaling
Errors
If you are willing to use the same standard page size, type font, margins,
line spacing, paragraph form, etc., used in this document, you may say so
by putting a line at the beginning of your file that says:
\input basic
If not, see the appropriate sections of this document; all the above must
be specified to use TEX. At the end of your file, put a line that says:
\par\vfill\end
When your file is complete, you may process it with TEX for output on a
Dover or an XGP printer, or on a Datadisc screen.
For output on the Dover printer, type the monitor command
r tex [return]
When the tex program types an asterisk, type
\input f [return]
where f.tex is the name of your file.
If your file contains no TEX errors or omissions, TEX will prepare an
output file called f.dov for printing on the Dover printer, and will
display on the screen a monitor command like
.dover fff.press
to print that file. If you want the printing done, hit [return] and wait
for your Dover output. If not, hit [clear] and do something else.
For output on the XGP printer, give the monitor command
r xgptex [return]
When the tex program types an asterisk, type
\input f [return]
where f.tex is the name of your file. If your file contains no TEX errors
or omissions, TEX will prepare an output file for printing on the XGP
printer, and will display on the screen a monitor command like
.r xgpsyn; f.xgp
If your file contains TEX errors, you will see an error message, perhaps
like.... See the section on Errors for help.
Punctuation
Periods, commas, semicolons, colons, question marks, exclamation points,
parentheses ( ), brackets [ ], and hyphens are all typed normally. If
followed by a space, TEX will print them with the usually correct amount
of space, except that a period in an abbreviation may be followed by too
much space. If so, use \_ rather than _ after the period.
Certain symbols have a special meaning in TEX; to enter them as ordinary
text, they must be preceded by \. Thus to get the symbol in the left
column below, type the sequence in the right column:
\ \\
$ \$
{ \{
} \}
% \%
# \#
To get a single hyphen (e.g. ``twenty-one''), just type it in. To get a
short dash, type two consecutive hyphens. To get a long dash, type three
consecutive hyphens. To get a minus sign, or any other mathematical sign,
see section Mathematical Formulas. (If you actually want two or more
consecutive hyphens, see the TEX manual, section ?). To hyphenate breaks
in long words see the TEX manual, Chapter 14.
To get single left or right quote marks (` '), use the keyboard symbols
` and ' respectively. To get double quotes (`` '') type a pair of
single quotes of the correct slant, not the ditto mark ". To get adjacent
nested quote marks (`` `), see the TEX manual, Chapter . To get an
ellipsis (like one, two, three,...) as shown in the left column, type
one of the following
$\ldots$
$\ldotss$
$\ldotsm$
where the first gives three dots with minimum spacing before and after;
the second is followed by extra space; the third is preceded and followed
by extra space.
Accents
To get the character in the left column below, type the sequence in the
right column. If the accent you want is not shown, see the TEX manual,
Chapter 9 and page 123.
Most of the accents shown on the letter o can be used with other letters
and symbols. The right hand table gives some exceptions.
o \'o a \a_a
A See TEX manual, Ch. 9.
o \`o l \l_l
L See TEX manual, Ch. 9.
o \=o c \c_c
C See TEX manual, Ch. 9.
o \"o i \'\i_
j \'\j_
o \A_o i \=\i_
i \b_i_
o \∨_o j \b_\j_
o \u_o
o \H_o
o \b_o
o \s_o
oo \t_oo
a \ae_
A \AE_
o \oe_
O \OE_
o \o_
O \O_
\ss_
Examples:
'~ (Spanish
c, O (French)
l (Polish)
yu (Russian)
o, B (German)
a, o (Danish)
Ha lich (German: ugly): H\"a\ss lich
canon subterraneo (Spanish: underground canyon): ca\s non
subterr\,aneo
mnozenje (Serbocroatian: multiplication): mno\u zenje
arboker (Norwegian: annals): \a arb\o ker
Measures
Distances and lengths in TEX can be specified in several units; these
include inches, millimeters, and points. To specify a length exactly, for
example the width of the printed region of the page, enter the length as a
number with optional sign and decimal point, followed by the abbreviation
for a unit of measure: in (inches), mm (millimeters), pt (points; about
1/72 in; capital letters in the standard fonts are 10pt high), or ex (the
height of a capital in the current font; this variable measure is useful
for dimensions like interline spacing that usually change when you change
the font size.
Often it is most satisfactory to allow some play in measurements, giving
TEX some freedom to shift symbols around. To allow play, enter a length
as |1| plus |2| minus |3|, where |1| is the preferred length, |2| is the
maximum desirable stretch, and |3| is the maximum permitted shrinkage.
The plus |2| or the minus |3| can be omitted if they are zero.
Examples: \vskip 0.25in leaves 1/4 inch of extra vertical space where it
appears. To right-justify a title, say ``TEX,'' in a line of a given
length, one might specify the length and then enter {\hskip 0pt plus
10000pt TEX}; the allowed stretch of the horizontal skip will fill the
line.
Spacing
When you begin a TEX file with \input basic, standard spacing measures,
the same as those used in this document, are read in. You may override
any of them at any place in your file. If the override is inside TEX
brackets { }, it will only apply up to the closing bracket. In the
patterns given below, | | is a measure, as explained in Measures; examples
are 10pt_ (the height of a capital letter) and 0.25in_ (1/4 inch).
To indent the first line of each paragraph by | |, enter \parindent | |.
To make the first line of each paragraph hang out on the left, as with
numbered paragraphs, use a negative measure. Standard is 20pt.
To get a page layout in which the printed region has width | |, enter
\hsize | |. Standard is 324 points = 4.5 inches.
For a printed region of height | |, enter \vsize | |. Standard is 504
points = 7 inches. On the resulting output page, the printed region will
be horizontally centered.
To set the normal top margin so that the base line (the imaginary line on
which most letters rest) is down | | from the top of the printed region,
enter \topbaseline | |. Standard is 10pt.
To set the normal distance between successive lines of text at | |, where
the distance is measured between baselines and so includes the height of
the letters on the lower line, enter \baselineskip | |. Standard is 12pt.
To set the normal extra distance between lines at the end of a paragraph
to | |, enter \parskip | |. Standard is 0pt, with 1pt of stretch.
To set the separation of oversized lines (e.g. lines that contain
displayed formulas like to | |, enter \lineskip | |. Standard is 1pt.
To set the spacing above and below displayed equations to | |, enter
\dispskip | |. Standard is 12pt, with 3pt of stretch and 3pt of shrink.
Lines
To make a horizontal line of length |1| and of this thickness (0.4pt) at
baseline level, enter \hrule width |1|. For a different thickness, follow
the entry by height |2|, where |2| is the measure of the desired
thickness. To make a vertical line of length |1|, and of this thickness
(0.4pt), enter \vrule height |1| depth 0 For a different thickness, follow
the entry by width |2| where |2| is the desired thickness. For more
details, see TEX manual, Ch. 21.